Matrix group structure and Markov invariants in the strand symmetric phylogenetic substitution model.
Abstract
We consider the continuous-time presentation of the strand symmetric phylogenetic substitution model (in which rate parameters are unchanged under nucleotide permutations given by Watson-Crick base conjugation). Algebraic analysis of the model's underlying structure as a matrix group leads to a change of basis in which the rate generator matrix takes a two-part block decomposition. We apply representation-theoretic techniques and, for any (fixed) number of phylogenetic taxa L and polynomial degree D of interest, provide the means to classify and enumerate the associated Markov invariants. In particular, in the quadratic and cubic cases we prove there are precisely [Formula: see text] and [Formula: see text] linearly independent Markov invariants, respectively. Additionally, we give the explicit polynomial forms of the Markov invariants for (i) the quadratic case with any number of taxa L, and (ii) the cubic case in the special case of a three-taxon phylogenetic tree. We close by showing that our results are of practical interest, since the quadratic Markov invariants provide independent estimates of phylogenetic distances based on (i) substitution rates within Watson-Crick conjugate pairs, and (ii) substitution rates across conjugate base pairs.
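The block decomposition mentioned in the abstract can be illustrated numerically. The sketch below is not the paper's construction: it assumes the base order A, C, G, T, a row-sum-zero convention for the generator, made-up rate values, and a hypothetical helper `strand_symmetric_generator`. It only relies on the general fact that a matrix commuting with the conjugation permutation (A<->T, C<->G) becomes block diagonal in a basis of symmetric and antisymmetric combinations of conjugate bases.

```python
import numpy as np

# Assumed base order for this sketch: A, C, G, T.
# Watson-Crick conjugation swaps A<->T and C<->G.
CONJ = np.array([3, 2, 1, 0])
P = np.eye(4)[CONJ]  # permutation matrix of base conjugation


def strand_symmetric_generator(rates):
    """Hypothetical helper: build a 4x4 generator from six free rates.

    `rates` maps one representative (i, j) of each conjugate pair of
    substitutions to its rate; the partner (conj(i), conj(j)) is given
    the same rate, which is the strand symmetry constraint.
    """
    Q = np.zeros((4, 4))
    for (i, j), r in rates.items():
        Q[i, j] = r
        Q[CONJ[i], CONJ[j]] = r
    Q -= np.diag(Q.sum(axis=1))  # assumed convention: rows sum to zero
    return Q


# Illustrative rate values only; they are not taken from the paper.
rates = {(0, 1): 0.10, (0, 2): 0.30, (0, 3): 0.05,
         (1, 0): 0.20, (1, 2): 0.07, (1, 3): 0.40}
Q = strand_symmetric_generator(rates)

# Columns: (A+T), (C+G), (A-T), (C-G), each normalised by sqrt(2).
S = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 1, 0, -1],
              [1, 0, -1, 0]]) / np.sqrt(2)

B = S.T @ Q @ S  # generator expressed in the new basis

print(np.allclose(P @ Q @ P, Q))    # Q is invariant under conjugation
print(np.allclose(B[:2, 2:], 0.0))  # upper-right 2x2 block vanishes
print(np.allclose(B[2:, :2], 0.0))  # lower-left 2x2 block vanishes
```

In this basis the generator splits into two 2x2 blocks, one acting on the conjugation-symmetric combinations and one on the antisymmetric ones. The basis used in the paper may differ by ordering or normalisation.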
Similar resources
The Strand Symmetric Model
Important special cases of strand symmetric Markov models are the group-based phylogenetic models, including the Jukes-Cantor model and the Kimura 2- and 3-parameter models. The general strand symmetric model, or in this chapter just the strand symmetric model (SSM), has only these eight equalities of probabilities in the transition matrices and no further restriction on the transition probabilities...
Tying Up Loose Strands: Defining Equations of the Strand Symmetric Model
The strand symmetric model is a phylogenetic model designed to reflect the symmetry inherent in the double-stranded structure of DNA. We show that the set of known phylogenetic invariants for the general strand symmetric model of the three leaf claw tree entirely defines the ideal. This knowledge allows one to determine the vanishing ideal of the general strand symmetric model of any trivalent ...
Evaluation of First and Second Markov Chains Sensitivity and Specificity as Statistical Approach for Prediction of Sequences of Genes in Virus Double Strand DNA Genomes
The growing amount of information on biological sequences has made the application of statistical approaches necessary for modeling and estimating their functions. In this paper, the sensitivity and specificity of the first and second Markov chains for gene prediction were evaluated using complete double-stranded DNA virus genomes. There were two approaches for prediction of each Markov Model parameter,...
Skewed Base Compositions, Asymmetric Transition Matrices, and Phylogenetic Invariants
Evolutionary inference methods that assume equal DNA base compositions and symmetric nucleotide substitution matrices, where these assumptions do not hold, are likely to group species on the basis of similar base compositions rather than true phylogenetic relationships. We propose an invariants-based method for dealing with this problem. An invariant QT of a tree T under a k-state Markov model,...
Markov invariants, plethysms, and phylogenetics (the long version)
We explore model-based techniques of phylogenetic tree inference exercising Markov invariants. Markov invariants are group invariant polynomials and are distinct from what is known in the literature as phylogenetic invariants, although we establish a commonality in some special cases. We show that the simplest Markov invariant forms the foundation of the Log-Det distance measure. We take as our...
Journal:
- Journal of Mathematical Biology
Volume 73, Issue 2
Publication date: 2016